Abstract

The k-means algorithm is a well-known method for partitioning n points that lie in
d-dimensional space into k clusters. Its main features are simplicity and speed in practice.
Theoretically, however, the best known upper bound on its running time (i.e. $O(n^{kd})$) can be
exponential in the number of points. Recently, Arthur and Vassilvitskii [2] showed a
super-polynomial worst-case analysis, improving the best known lower bound from $\Omega(n)$ to
$2^{\Omega(\sqrt{n})}$ with a construction in $d = \Omega(\sqrt{n})$ dimensions. In [2] they also
conjectured the existence of super-polynomial lower bounds for any $d \ge 2$.

Our contribution is twofold: we prove this conjecture and we improve the lower bound, by
presenting a simple construction in the plane that leads to the exponential lower bound
$2^{\Omega(n)}$.



1 Introduction 



The k-means method is one of the most widely used algorithms for geometric clustering. It
was originally proposed by Forgy in 1965 [7] and MacQueen in 1967 [13], and is often known as
Lloyd's algorithm [12]. It is a local search algorithm that partitions n data points into k clusters
as follows: seeded with k initial cluster centers, it assigns every data point to its closest center,
and then recomputes the new centers as the means (or centers of mass) of their assigned points.
This process of assigning data points and readjusting centers is repeated until it stabilizes.

Despite its age, k-means is still very popular today and is considered "by far the most
popular clustering algorithm used in scientific and industrial applications", as Berkhin remarks
in his survey on data mining [3]. Its usage extends over a variety of areas, such as artificial
intelligence, computational biology and computer graphics, to name just a few (see [1, 8]). It is
particularly popular because of its simplicity and observed speed: as Duda et al. say in their
text on pattern classification [6], "In practice the number of iterations is much less
than the number of samples".

Although speed is recognized in practice as one of the main qualities of k-means (see [11] for
empirical studies), there are few theoretical bounds on its worst-case running time, and they
do not corroborate this feature.

An upper bound of $O(k^n)$ can be trivially established, since it can be shown that no clustering
occurs twice during the course of the algorithm. In [10], Inaba et al. improved this bound to
$O(n^{kd})$ by counting the number of Voronoi partitions of n points in $\mathbb{R}^d$ into k classes.
Other bounds are known for some special cases. Namely, Dasgupta [5] analyzed the case $d = 1$,
proving an upper bound of $O(n)$ when $k < 5$, and a worst-case lower bound of $\Omega(n)$. Later,
Har-Peled and Sadri [9], again for the one-dimensional case, showed an upper bound of
$O(n\Delta^2)$, where $\Delta$ is the spread of the point set (i.e. the ratio between the largest and
the smallest pairwise distance), and conjectured that k-means might run in time polynomial in
n and $\Delta$ for any d.

The upper bound $O(n^{kd})$ for the general case has not been improved in more than a
decade, which suggests that it might not be far from the truth. Arthur and Vassilvitskii
[2] showed that k-means can run for super-polynomially many iterations, improving the best
known lower bound from $\Omega(n)$ [5] to $2^{\Omega(\sqrt{n})}$. Their construction lies in a space
with $d = \Theta(\log n)$ dimensions, and they leave open the question of the performance of
k-means for a smaller number of dimensions d, conjecturing the existence of super-polynomial
lower bounds when $d > 1$. They also show that their construction can be modified to have low
spread, disproving the aforementioned conjecture in [9] for $d = \Omega(\log n)$.

A more recent line of work that aims to close the gap between practical and theoretical
performance makes use of the smoothed analysis introduced by Spielman and Teng [15].
Arthur and Vassilvitskii [3] proved a smoothed upper bound of $\mathrm{poly}(n^{O(k)})$, recently
improved to $\mathrm{poly}(n^{O(\sqrt{k})})$ by Manthey and Röglin [Ti].

1.1 Our result 

In this work we are interested in the performance of k-means in a low-dimensional space. As
mentioned above, it is conjectured [2] that for any $d \ge 2$ there exist instances in d dimensions
for which k-means runs for a super-polynomial number of iterations.

Our main result is a construction in the plane ($d = 2$) for which k-means requires
exponentially many iterations to stabilize. Specifically, we present a set of n data points lying
in $\mathbb{R}^2$, and a set of $k = \Theta(n)$ adversarially chosen cluster centers in $\mathbb{R}^2$,
for which the algorithm runs for $2^{\Omega(n)}$ iterations. This proves the aforementioned
conjecture and, at the same time, improves the best known lower bound from $2^{\Omega(\sqrt{n})}$
to $2^{\Omega(n)}$. Notice that the exponent is optimal up to a logarithmic factor, since the bound
for the general case, $O(n^{kd})$, can be rewritten as $2^{O(n \log n)}$ when $d = 2$ and
$k = \Theta(n)$. For any $k = o(n)$, our lower bound easily translates to $2^{\Omega(k)}$, which,
analogously, is almost optimal since the upper bound is $2^{O(k \log n)}$.
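
The comparison between the upper and lower bounds is a one-line change of base. The following worked step is only a sketch of that arithmetic, assuming base-2 logarithms and $d = 2$ as in the text:

```latex
\[
  n^{kd} \;=\; 2^{kd \log_2 n}
  \qquad\Longrightarrow\qquad
  O\!\left(n^{2k}\right) \;=\; 2^{O(k \log n)} .
\]
\[
  \text{With } k = \Theta(n):\quad
  2^{\Omega(n)} \;\le\; \text{(worst-case iterations)} \;\le\; 2^{O(n \log n)},
\]
\[
  \text{and for general } k:\quad
  2^{\Omega(k)} \;\le\; \text{(worst-case iterations)} \;\le\; 2^{O(k \log n)} .
\]
```

In both cases the exponents of the lower and upper bounds differ only by the $\log n$ factor.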

A common practice for seeding k-means is to choose the initial centers as a subset of the
data points. We show that even in this case (i.e. cluster centers adversarially chosen among the
data points), the running time of k-means is still exponential.

Also, using a result in [2], our construction can be modified to an instance in $d = 3$ dimensions
having low spread for which k-means requires $2^{\Omega(n)}$ iterations, which disproves the
conjecture of Har-Peled and Sadri [9] for any $d \ge 3$.

Finally, we observe that our result implies that smoothed analysis helps even for a small
number of dimensions, since the best smoothed upper bound is $n^{O(\sqrt{k})}$, while our lower
bound is $2^{\Omega(k)}$, which is larger for $k = \omega(\log^2 n)$. In other words, perturbing each
data point and then running k-means would improve the performance of the algorithm.

2 The k-means algorithm

The k-means algorithm partitions a set X of n points in $\mathbb{R}^d$ into k clusters. It is seeded
with any initial set of k cluster centers in $\mathbb{R}^d$, and given the cluster centers, every data
point is assigned to the cluster whose center is closest to it. The name "k-means" refers to the
fact that the new position of a center is computed as the center of mass (or mean point) of the
points assigned to it.

A formal definition of the algorithm is the following: 

0. Arbitrarily choose k initial centers $c_1, c_2, \ldots, c_k$.

1. For each $1 \le i \le k$, set the cluster $C_i$ to be the set of points in X that are closer to
$c_i$ than to any $c_j$ with $j \ne i$.

2. For each $1 \le i \le k$, set $c_i = \frac{1}{|C_i|} \sum_{x \in C_i} x$, i.e. the center of mass of
the points in $C_i$.

3. Repeat steps 1 and 2 until the clusters $C_i$ and the centers $c_i$ do not change anymore. The
partition of X is the set of clusters $C_1, C_2, \ldots, C_k$.
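
The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation referenced later in the paper: the function name `kmeans`, the `max_iter` cap, the tie-breaking rule (lowest center index) and the choice to leave the center of an empty cluster where it is are all assumptions made here for brevity.

```python
from math import dist

def kmeans(points, centers, max_iter=10_000):
    """Lloyd's iterations on a list of point tuples.

    Returns (clusters, centers, passes), where `passes` counts how many
    assignment passes (step 1) were executed, including the final one
    that detects that nothing changed.
    """
    centers = [tuple(c) for c in centers]
    assignment = None
    passes = 0
    for passes in range(1, max_iter + 1):
        # Step 1: assign every point to its closest center
        # (assumption: ties go to the center with the lowest index).
        new_assignment = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # Step 3: clusters (and hence centers) are unchanged.
        assignment = new_assignment
        # Step 2: move every center to the mean of its cluster.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # assumption: an empty cluster keeps its old center
                dim = len(members[0])
                centers[j] = tuple(
                    sum(m[c] for m in members) / len(members) for c in range(dim)
                )
    clusters = [
        [p for p, a in zip(points, assignment) if a == j]
        for j in range(len(centers))
    ]
    return clusters, centers, passes

clusters, centers, passes = kmeans(
    [(0, 0), (0, 1), (10, 0), (10, 1)], [(0, 0), (1, 0)]
)
```

On this toy input the first pass already separates the two pairs of points, and the second pass only confirms that the clustering is stable.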

Note that the algorithm might run into two possible "degenerate" situations. The first is
when no points are assigned to a center; in this case that center is removed and we obtain a
partition with fewer than k clusters. The other degeneracy is when a point is equally close to
more than one center; in this case the tie is broken arbitrarily.

We stress that when k-means runs on our constructions, it does not fall into either of these
situations, so the lower bound does not exploit these degeneracies.

Our construction uses points that have constant integer weights. This means that the data
set that k-means takes as input is actually a multiset, and the center of mass of a cluster $C_i$
(step 2 of k-means) is computed as $\sum_{x \in C_i} w_x x / \sum_{x \in C_i} w_x$, where $w_x$ is the
weight of x. This is not a restriction, since integer weights in the range $[1, C]$ can be simulated
by blowing up the size of the data set by a factor of at most C: it is enough to replace each point
x of weight w with a set of w distinct points (of unit weight) whose center of mass is x, and
which are so close to each other that the behavior of k-means (as well as its number of
iterations) is not affected.
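
The equivalence between integer weights and a multiset can be illustrated with a short sketch. The helper names `weighted_mean` and `replicate` are hypothetical, and for simplicity the copies are placed exactly on top of each other, whereas the text uses w distinct points that are merely very close; exact rational arithmetic is used so the two means can be compared for equality.

```python
from fractions import Fraction

def weighted_mean(points, weights):
    """Center of mass: sum(w_x * x) / sum(w_x), computed exactly."""
    total = sum(weights)
    dim = len(points[0])
    return tuple(
        sum(Fraction(w) * p[c] for p, w in zip(points, weights)) / total
        for c in range(dim)
    )

def replicate(points, weights):
    """Simulate integer weights by repeating each point w times
    (assumption: coincident copies instead of distinct nearby points)."""
    return [p for p, w in zip(points, weights) for _ in range(w)]

points, weights = [(0, 0), (4, 2)], [1, 3]
direct = weighted_mean(points, weights)
copies = replicate(points, weights)
via_multiset = weighted_mean(copies, [1] * len(copies))
```

Both computations give the same center, and the multiset has `sum(weights)` points, i.e. at most C times the original number of points.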

3 Lower bound 

In this section we present a construction in the plane for which k-means requires $2^{\Omega(n)}$
iterations. We start with some high-level intuition for the construction, then we give some
definitions explaining the idea behind it, and finally we proceed to the formal proof.

At the end of the section, we show two extensions: the first is a modification of our
construction so that the initial set of centers is a subset of the data points, and the second
describes how to obtain low spread.

A simple implementation in Python of the lower bound is available at the web address
http://www.cse.ucsd.edu/~avattani/k-means/lowerbound.py

[Figure 1 diagram: the stages of a day of $W_i$, in order. Morning: watching $W_{i-1}$.
1st call: if $W_{i-1}$ falls asleep, $W_i$ wakes it up. Afternoon: watching $W_{i-1}$.
2nd call: if $W_{i-1}$ falls asleep, $W_i$ wakes it up. Night: sleeping until $W_{i+1}$ calls;
on $W_{i+1}$'s call, $W_i$ is awoken.]

Figure 1: The "day" of the watchman $W_i$, $i > 0$.



3.1 High level intuition 

The idea behind our construction is simple and can be related to the saying "Who watches the
watchmen?" (or the original Latin phrase "Quis custodiet ipsos custodes?").

Consider a sequence of t watchmen $W_0, W_1, \ldots, W_{t-1}$. A "day" of a watchman $W_i$ ($i > 0$)
can be described as follows (see Fig. 1): $W_i$ watches $W_{i-1}$, waking it up once it falls asleep,
and does so twice; afterwards, $W_i$ falls asleep itself. The watchman $W_0$ instead simply falls
asleep directly after it has been woken up. Now if each watchman is awake at the beginning of
this process (or even just $W_{t-1}$), it is clear that $W_0$ will be woken up $2^{\Omega(t)}$ times by
the time that every watchman is asleep.
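
The doubling behind this count can be checked with a tiny simulation. This is a sketch of the watchmen process only (not of the geometric construction), under the assumption that only $W_{t-1}$ starts awake; the function name `wake_ups_of_w0` is hypothetical.

```python
def wake_ups_of_w0(t):
    """Count how often W_0 is woken when only W_{t-1} starts awake."""
    wake_ups = [0] * t

    def day(i):
        # One full "day" of W_i, starting right after it has been woken up.
        if i == 0:
            return  # W_0 falls asleep immediately after being woken.
        for _ in range(2):  # 1st call and 2nd call
            wake_ups[i - 1] += 1  # W_{i-1} is asleep, so W_i wakes it up...
            day(i - 1)            # ...and W_{i-1} goes through its own day.
        # After the second call, W_i falls asleep (night).

    day(t - 1)
    return wake_ups[0]

counts = [wake_ups_of_w0(t) for t in range(2, 8)]
```

Each additional watchman doubles the number of times $W_0$ is woken, so with t watchmen the count is $2^{t-1}$, in line with the $2^{\Omega(t)}$ claim.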

In the construction we have a sequence of gadgets $G_0, G_1, \ldots, G_{t-1}$, where all gadgets
$G_i$ with $i > 0$ are identical except for the scale. Any gadget $G_i$ ($i > 0$) has a fixed number of
points and two centers, and different clusterings of its points model which stage of the day $G_i$
is in. The clustering indicating that $G_i$ "fell asleep" has one center in a particular position
$S_i^*$.

In the situation when $G_{i+1}$ is awake and $G_i$ falls asleep, some points of $G_{i+1}$ will be
assigned temporarily to the center of $G_i$ located in $S_i^*$; in the next step this center will move
so that in one more step the initial clustering (or "morning clustering") of $G_i$ is restored: this
models the fact that $G_{i+1}$ wakes up $G_i$.

Note that since each gadget has a constant number of centers, we can build an instance with
k clusters that has $t = \Theta(k)$ gadgets, for which k-means will require $2^{\Omega(t)}$ iterations.
Also, since each gadget has a constant number of points, we can build an instance of n points and
$k = \Theta(n)$ clusters with $t = \Theta(n)$ gadgets. This will imply a lower bound of
$2^{\Omega(n)}$ on the running time of k-means.

3.2 Definitions and further intuition 

For any $i > 0$, the gadget $G_i$ is a tuple $(\mathcal{P}_i, \mathcal{C}_i, r_i, R_i)$, where
$\mathcal{P}_i \subset \mathbb{R}^2$ is the set of points of the gadget, defined as
$\mathcal{P}_i = \{P_i, Q_i, A_i, B_i, C_i, D_i, E_i\}$ where the points have constant weights, while
$\mathcal{C}_i$ is the set of initial centers of the gadget $G_i$ and contains exactly two centers.
Finally, $r_i \in \mathbb{R}^+$ and $R_i \in \mathbb{R}^+$ denote respectively the "inner radius" and
the "outer radius" of the gadget, and their purpose will be explained later on. Since the weights
of the points do not change between the gadgets, we will denote the weight of $P_i$ (for any
$i > 0$) by $w_P$, and similarly for the other points.

As for the "leaf" gadget $G_0$, the set $\mathcal{P}_0$ is composed of only one point F (of constant
weight $w_F$), and $\mathcal{C}_0$ contains only one center.

The set of points of the k-means instance will be the union of the (weighted) points from
all the gadgets, i.e. $\bigcup_{i=0}^{t-1} \mathcal{P}_i$ (with a total of $7(t-1) + 1 = O(t)$ points of
constant weight). Similarly, the set of initial centers will be the union of the centers from all the
gadgets, that is $\bigcup_{i=0}^{t-1} \mathcal{C}_i$ (with a total of $2(t-1) + 1 = O(t)$ centers).

As we mentioned above, when one of the centers of $G_i$ moves to a special location $S_i^*$, it will
mean that $G_i$ fell asleep. For $i > 0$ we define $S_i^*$ as the center of mass of the cluster
$\{A_i, B_i, C_i, D_i\}$, while $S_0^*$ coincides with F.

Figure 2: The "day" of the gadget $G_i$. The diamonds denote the means of the clusters. The
locations of the points in the figure give an idea of the actual gadget used in the proof. Also,
the bigger the size of a point is, the bigger its weight is.



For a gadget $G_i$ ($i > 0$), we describe the stages (clusterings) it goes through during a day.
The entire sequence is shown in Fig. 2.

Morning. This stage takes place right after $G_i$ has been woken up or at the beginning of the
entire process. The singleton $\{A_i\}$ is one cluster, and the remaining points form the other
cluster. In this configuration $G_i$ is watching $G_{i-1}$ and intervenes once it falls asleep.

1st call. Once $G_{i-1}$ falls asleep, $P_i$ will join the cluster of $G_{i-1}$ with center in
$S_{i-1}^*$ (pt. I). At the next step (pt. II), $Q_i$ too will join that cluster, and $B_i$ will instead
move to the cluster $\{A_i\}$. The two points $P_i$ and $Q_i$ are waking up $G_{i-1}$ by causing a
restore of its morning clustering.

Afternoon. The points $P_i$, $Q_i$ and $C_i$ will join the cluster $\{A_i, B_i\}$. Thus, $G_i$ ends up
with the clusters $\{A_i, B_i, C_i, P_i, Q_i\}$ and $\{D_i, E_i\}$. In this configuration, $G_i$ is again
watching $G_{i-1}$ and is ready to wake it up once it falls asleep.

2nd call. Once $G_{i-1}$ falls asleep, similarly to the 1st call, $P_i$ will join the cluster of
$G_{i-1}$ with center in $S_{i-1}^*$ (pt. I). At the next step (pt. II), $Q_i$ too will join that cluster,
and $D_i$ will join the cluster $\{A_i, B_i, C_i\}$ (note that the other cluster of $G_i$ is the
singleton $\{E_i\}$). Again, $P_i$ and $Q_i$ are waking up $G_{i-1}$.

Night. At this point, the cluster $\{A_i, B_i, C_i, D_i\}$ is already formed, which implies that its
mean is located in $S_i^*$: thus, $G_i$ is sleeping. However, note that $P_i$ and $Q_i$ are still in
some cluster of $G_{i-1}$ and the remaining point $E_i$ is in a singleton cluster. In the next step,
concurrently with the beginning of a possible call from $G_{i+1}$ (see $G_{i+1}$'s call, pt. I), the
points $P_i$ and $Q_i$ will join the singleton $\{E_i\}$.

Figure 3: $G_{i+1}$'s call: how $G_{i+1}$ wakes up $G_i$. The distance between the two gadgets is
actually much larger than it appears in the figure.



The two radii of the gadget $G_i$ ($i > 0$) can be interpreted in the following way. Whenever
$G_i$ is watching $G_{i-1}$ (either morning or afternoon), the distance between the point $P_i$ and
its mean will be exactly $R_i$. On the other hand, the distance between $P_i$ and $S_{i-1}^*$ (where
a mean of $G_{i-1}$ will move when $G_{i-1}$ falls asleep) will be just a bit less than $R_i$. In this
way we guarantee that the waking-up process will start at the right time. Also, we know that this
process will involve $Q_i$ too, and we want the mean that was originally in $S_{i-1}^*$ to end up at
distance more than $r_i$ from $P_i$. In that step, one of the means of $G_i$ will be at distance
exactly $r_i$ from $P_i$, and thus $P_i$ (and $Q_i$ too) will come back to one of the clusters of $G_i$.

Now we analyze the waking-up process from the point of view of the sleeping gadget. We
suppose that $G_i$ ($i > 0$) is sleeping and that $G_{i+1}$ wants to wake it up. The sequence is shown
in Fig. 3.

$G_{i+1}$'s call. Suppose that $G_{i+1}$ started to wake up $G_i$. Then, we know that $P_{i+1}$ joined
the cluster $\{A_i, B_i, C_i, D_i\}$ (pt. I). However, this does not cause any point from this cluster
to move to other clusters. On the other hand, as we said before, the points $P_i$ and $Q_i$ will
"come back" to $G_i$ by joining the cluster $\{E_i\}$. At the next step (pt. II), $Q_{i+1}$ too will join
the cluster $\{A_i, B_i, C_i, D_i, P_{i+1}\}$. The new center will be in a position such that, in one
more step (pt. III), $B_i$, $C_i$ and $D_i$ will move to the cluster $\{P_i, Q_i, E_i\}$. Also we know
that at that very same step, $P_{i+1}$ and $Q_{i+1}$ will come back to some cluster of $G_{i+1}$: this
implies that $G_i$ will end up with the clusters $\{B_i, C_i, D_i, E_i, P_i, Q_i\}$ and $\{A_i\}$, which
is exactly the morning clustering: $G_i$ has been woken up.

As for the "leaf" gadget $G_0$, we said that it will fall asleep right after it has been woken up by
$G_1$. Thus we can describe its day in the following way:






Night. There is only one cluster, which is the singleton $\{F\}$. The center is obviously F, which
coincides with $S_0^*$. In this configuration $G_0$ is sleeping.

$G_1$'s call. The point $P_1$ from $G_1$ joins the cluster $\{F\}$ and in the next step $Q_1$ will join
the same cluster too. After one more step, both $P_1$ and $Q_1$ will come back to some cluster of
$G_1$, which implies that the cluster of $G_0$ is the singleton $\{F\}$ again. Thus $G_0$, after having
been temporarily woken up, falls asleep again.

3.3 Formal Construction 

We start by giving the distances between the points in a single gadget (intra-gadget).
Afterwards, we will give the distances between two consecutive gadgets (inter-gadget). Henceforth
$x_{A_i}$ and $y_{A_i}$ will denote respectively the x-coordinate and y-coordinate of the point
$A_i$, and analogous notation will be used for the other points. Also, for a set of points
$\mathcal{S}$, we define its total weight $w_{\mathcal{S}} = \sum_{x \in \mathcal{S}} w_x$, and its mean
will be denoted by $\mu(\mathcal{S})$, i.e.
$\mu(\mathcal{S}) = \frac{1}{w_{\mathcal{S}}} \sum_{x \in \mathcal{S}} w_x \cdot x$. We suppose
that all the weights $w_P, w_Q, w_A, \ldots$ have been fixed to some positive integer values, and
that $w_A = w_B$ and $w_F = w_A + w_B + w_C + w_D$.

We start by describing the distances between points for a non-leaf gadget. For simplicity, we
first define the location of the points for a hypothetical "unit" gadget $\hat{G}$ that has unit
inner radius (i.e. $\hat{r} = 1$) and is centered in the origin (i.e. $\hat{P} = (0, 0)$). Then we will
see how to define a gadget $G_i$ (for any $i > 0$) in terms of the unit gadget $\hat{G}$.

The outer radius is defined as $\hat{R} = (1 + \delta)$, and we let the point $\hat{Q}$ be
$\hat{Q} = (\lambda, 0)$. The values $0 < \delta < 1$ and $0 < \lambda < 1$ are constants whose values
will be assigned later. The point $\hat{E}$ is defined as $\hat{E} = (0, 1)$.

The remaining points are aligned on the vertical line with x-coordinate equal to 1 (formally,
$x_{\hat{A}} = x_{\hat{B}} = x_{\hat{C}} = x_{\hat{D}} = 1$). As for the y-coordinates, we set
$y_{\hat{A}} = -1/2$ and $y_{\hat{B}} = 1/2$.

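The value $y_{\hat{C}}$ is uniquely defined by imposing $y_{\hat{C}} > 0$ and that the mean of the
cluster $\hat{\mathcal{M}} = \{\hat{A}, \hat{B}, \hat{C}, \hat{P}, \hat{Q}\}$ is at distance $\hat{R}$ from
$\hat{P}$. Thus, we want the positive $y_{\hat{C}}$ that satisfies the equation
$\|\mu(\hat{\mathcal{M}})\| = \hat{R}$, which can be rewritten as

$$\left(\frac{w_A + w_B + w_C + w_Q \lambda}{w_{\mathcal{M}}}\right)^2 + \left(\frac{w_C\, y_{\hat{C}}}{w_{\mathcal{M}}}\right)^2 = (1 + \delta)^2$$

where we used the fact that $w_A y_{\hat{A}} + w_B y_{\hat{B}} = 0$ when $w_A = w_B$.
We easily obtain the solution

$$y_{\hat{C}} = \frac{1}{w_C} \sqrt{\big(w_{\mathcal{M}} (1 + \delta)\big)^2 - \big(w_A + w_B + w_C + w_Q \lambda\big)^2}$$

Note that the value under the square root is always positive because $\lambda < 1$.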

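It remains to set $y_{\hat{D}}$. Its value is uniquely defined by imposing $y_{\hat{D}} > 0$ and that
the mean of the cluster $\hat{\mathcal{N}} = \{\hat{B}, \hat{C}, \hat{D}, \hat{E}, \hat{P}, \hat{Q}\}$ is at
distance $\hat{R}$ from $\hat{P}$. Analogously to the previous case, $y_{\hat{D}}$ is the positive value
satisfying $\|\mu(\hat{\mathcal{N}})\| = \hat{R}$, which is equivalent to

$$\left(\frac{w_B + w_C + w_D + w_Q \lambda}{w_{\mathcal{N}}}\right)^2 + \left(\frac{w_D\, y_{\hat{D}} + w_B (1/2) + w_C\, y_{\hat{C}} + w_E}{w_{\mathcal{N}}}\right)^2 = (1 + \delta)^2$$

Now, since the equation $a^2 + (b + x)^2 = c^2$ has the solutions $x = \pm\sqrt{c^2 - a^2} - b$, we
obtain the solution

$$y_{\hat{D}} = \frac{1}{w_D} \left( \sqrt{\big(w_{\mathcal{N}} (1 + \delta)\big)^2 - \big(w_B + w_C + w_D + w_Q \lambda\big)^2} - w_B/2 - w_C\, y_{\hat{C}} - w_E \right)$$

Again, the term under the square root is always positive.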
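
The two closed forms can be evaluated numerically as a sanity check. The sketch below plugs in the weights and constants listed in Table 1; it assumes $\delta = 0.025$ (consistent with the table entry $(1 + \delta) = 1.025$) and uses hypothetical variable names. It then recomputes the two cluster means directly to confirm that both lie at distance $1 + \delta$ from the origin.

```python
from math import sqrt, hypot

# Constants and weights as listed in Table 1 (delta = 0.025 is an assumption
# made here, chosen to agree with the table entry 1 + delta = 1.025).
delta, lam = 0.025, 1e-5
w = {"P": 1.0, "Q": 0.01, "A": 4.0, "B": 4.0, "C": 11.0, "D": 31.0, "E": 274.0}
R = 1 + delta

# y-coordinate of C: mean of M = {A, B, C, P, Q} at distance R from the origin.
w_M = sum(w[n] for n in "ABCPQ")
x_num_M = w["A"] + w["B"] + w["C"] + w["Q"] * lam
y_C = sqrt((w_M * R) ** 2 - x_num_M ** 2) / w["C"]

# y-coordinate of D: mean of N = {B, C, D, E, P, Q} at distance R from the origin.
w_N = sum(w[n] for n in "BCDEPQ")
x_num_N = w["B"] + w["C"] + w["D"] + w["Q"] * lam
y_D = (sqrt((w_N * R) ** 2 - x_num_N ** 2)
       - w["B"] / 2 - w["C"] * y_C - w["E"]) / w["D"]

pts = {"P": (0.0, 0.0), "Q": (lam, 0.0), "A": (1.0, -0.5), "B": (1.0, 0.5),
       "C": (1.0, y_C), "D": (1.0, y_D), "E": (0.0, 1.0)}

def weighted_mean(names):
    """Weighted center of mass of the named unit-gadget points."""
    total = sum(w[n] for n in names)
    return (sum(w[n] * pts[n][0] for n in names) / total,
            sum(w[n] * pts[n][1] for n in names) / total)

mean_M = weighted_mean("ABCPQ")
mean_N = weighted_mean("BCDEPQ")
print(y_C, y_D, hypot(*mean_M), hypot(*mean_N))
```

With these inputs the two y-coordinates come out close to 0.7022 and 1.3574, in line with the ranges reported for C and D in Table 1.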

Finally, we define $\hat{S}^*$ in the natural way as $\hat{S}^* = \mu\{\hat{A}, \hat{B}, \hat{C}, \hat{D}\}$.

Now consider a gadget $G_i$ with $i > 0$. Suppose we have fixed the inner radius $r_i$ and the center
$P_i$. Then we have the outer radius $R_i = (1 + \delta) r_i$, and we define the location of the points
in terms of the unit gadget by scaling by $r_i$ and translating by $P_i$ in the following way:
$A_i = P_i + r_i \hat{A}$, $B_i = P_i + r_i \hat{B}$, and so on for the other points.

As for the gadget $G_0$, there are no intra-gadget distances to be defined, since it has only one
point F.

For any $i > 0$, the intra-gadget distances in $G_i$ have been defined (as a function of $P_i$, $r_i$,
$\delta$ and $\lambda$). Now we define the (inter-gadget) distances between the points of two
consecutive gadgets $G_i$ and $G_{i+1}$, for any $i \ge 0$. We do this by giving explicit recursive
expressions for $r_i$ and $P_i$.

For a point $Z \in \{A, B, C, D\}$, we define the "stretch" of Z (from $S^*$ with respect to
$\mu\{E, P, Q\}$) as

$$\sigma(Z) = \sqrt{d^2(Z, \mu\{E, P, Q\}) - d^2(Z, S^*)}$$

The stretch will be a real number (for all points A, B, C, D), given the values $\lambda$, $\delta$ and
the weights used in the construction.
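
The stretches can be evaluated numerically for the unit gadget. The following is a sketch only: it assumes the weights and constants of Table 1 (with $\delta = 0.025$, consistent with the table entry $(1 + \delta) = 1.025$), recomputes the y-coordinates of C and D from their closed forms, and uses hypothetical helper names.

```python
from math import sqrt, dist

delta, lam = 0.025, 1e-5  # assumed values, taken from Table 1
w = {"P": 1.0, "Q": 0.01, "A": 4.0, "B": 4.0, "C": 11.0, "D": 31.0, "E": 274.0}
R = 1 + delta

# Closed forms for the y-coordinates of C and D in the unit gadget.
w_M = sum(w[n] for n in "ABCPQ")
y_C = sqrt((w_M * R) ** 2
           - (w["A"] + w["B"] + w["C"] + w["Q"] * lam) ** 2) / w["C"]
w_N = sum(w[n] for n in "BCDEPQ")
y_D = (sqrt((w_N * R) ** 2 - (w["B"] + w["C"] + w["D"] + w["Q"] * lam) ** 2)
       - w["B"] / 2 - w["C"] * y_C - w["E"]) / w["D"]

pts = {"P": (0.0, 0.0), "Q": (lam, 0.0), "A": (1.0, -0.5), "B": (1.0, 0.5),
       "C": (1.0, y_C), "D": (1.0, y_D), "E": (0.0, 1.0)}

def mu(names):
    """Weighted mean of the named points."""
    total = sum(w[n] for n in names)
    return (sum(w[n] * pts[n][0] for n in names) / total,
            sum(w[n] * pts[n][1] for n in names) / total)

S_star = mu("ABCD")   # position of the "sleeping" center
m_EPQ = mu("EPQ")     # mean of the cluster {E, P, Q}

def stretch(name):
    """sigma(Z) = sqrt(d^2(Z, mu{E,P,Q}) - d^2(Z, S*))."""
    z = pts[name]
    return sqrt(dist(z, m_EPQ) ** 2 - dist(z, S_star) ** 2)

sigma = {name: stretch(name) for name in "ABCD"}
print(sigma)
```

Under these assumptions the stretches are all very close to 1 and decrease from A to D, which is the ordering the proof relies on when it reduces a claim about all of A, B, C, D to a single extreme point.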

We set the inner radius $r_0$ of the leaf gadget $G_0$ to an arbitrary positive value, and for any
$i \ge 0$, we define

$$r_{i+1} = \frac{r_i}{1 + \delta} \cdot \frac{w_F + w_P + w_Q}{w_P + (1 + \lambda) w_Q} \cdot \sigma(A) \qquad (1)$$

where we recall that $w_F = w_A + w_B + w_C + w_D$.

Now recall that $S_i^* = \mu\{A_i, B_i, C_i, D_i\}$ for any $i > 0$, and $S_0^* = \mu\{F\} = F$. Assuming
we have fixed the point F somewhere in the plane, we define for any $i > 0$

$$x_{P_i} = x_{S_{i-1}^*} + R_i (1 - \epsilon) \qquad (2)$$

where $0 < \epsilon < 1$ is some constant to be defined. Note that now the instance is completely
defined as a function of $\lambda$, $\delta$, $\epsilon$ and the weights. We are now ready to prove the
lower bound.



3.4 Proof 

We assume that the initial centers (those we seed k-means with) correspond to the means of
the "morning clusters" of each gadget $G_i$ with $i > 0$. Namely, the initial centers are $\mu\{A_i\}$
and $\mu\{B_i, C_i, D_i, E_i, P_i, Q_i\}$ for all $i > 0$, in addition to the center $\mu\{F\} = F$ for the
leaf gadget $G_0$.

In order to establish our result, it is enough to show that there exist positive integer values
$w_A, w_B, w_C, w_D, w_E, w_F, w_P, w_Q$ (with $w_A = w_B$) and values for $\lambda$, $\delta$ and
$\epsilon$, such that the behavior of k-means on the instance reflects exactly the clustering
transitions described in Section 3.2. The chosen values (as well as other derived values used later
in the analysis) are in Table 1. The use of rational weights is not restrictive, because the mean of
a cluster (as well as the behavior of k-means) does not change if we multiply the weights of its
points by the same factor: in our case it is enough to multiply all the weights by 100 to obtain
integer weights.

Finally, for the value of $\epsilon$, we impose

$$0 < \epsilon < \min\left\{ \frac{d^2(S^*, C)}{(1 + \delta)^2},\ \frac{\lambda}{1 + \delta},\ \frac{\sigma(A) - \sigma(B)}{\sigma(A)},\ 1 - \frac{(1 + \lambda w_Q)(w_F + w_P + w_Q)}{(1 + \delta)\, w_F} \right\}$$

Throughout the proof, we will say that a point Z in a cluster $\mathcal{C}$ is stable with respect to
(w.r.t.) another cluster $\mathcal{C}'$ if $d(Z, \mu(\mathcal{C})) < d(Z, \mu(\mathcal{C}'))$. Similarly, a
point Z in a cluster $\mathcal{C}$ is stable if Z is stable w.r.t. any $\mathcal{C}' \ne \mathcal{C}$.
Similar definitions of stability extend to a cluster (resp. clustering) if the stability holds for all
the points in the cluster (resp. for all the clusters in the clustering).
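
The stability definitions translate directly into a small predicate. This is a minimal sketch with hypothetical function names, written for unweighted points for brevity (the construction itself uses weighted means):

```python
from math import dist

def mean(cluster):
    """Unweighted center of mass of a non-empty list of points."""
    dim = len(cluster[0])
    return tuple(sum(p[c] for p in cluster) / len(cluster) for c in range(dim))

def is_stable_point(z, own, other):
    """Z in cluster `own` is stable w.r.t. `other` if it is strictly
    closer to the mean of `own` than to the mean of `other`."""
    return dist(z, mean(own)) < dist(z, mean(other))

def is_stable_clustering(clusters):
    """A clustering is stable if every point is stable w.r.t. every other cluster."""
    return all(
        is_stable_point(z, own, other)
        for i, own in enumerate(clusters)
        for j, other in enumerate(clusters) if i != j
        for z in own
    )

stable = is_stable_clustering([[(0, 0), (0, 1)], [(5, 0), (5, 1)]])
unstable = is_stable_clustering([[(0, 0), (5, 0)], [(5, 1)]])
```

A stable clustering is exactly a fixed point of the assignment step: running step 1 of k-means on it moves no point.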

We consider an arbitrary gadget $G_i$ with $i > 0$ in any stage of its day (some clustering), and
we show that the steps that k-means goes through are exactly the ones described in Section 3.2
for that stage of the day (for the chosen values of $\lambda$, $\delta$, $\epsilon$ and weights). For the
sake of convenience and w.l.o.g., we assume that $G_i$ has unit inner radius (i.e.
$r_i = \hat{r} = 1$ and $R_i = \hat{R} = (1 + \delta)$) and that $P_i$ is in the origin (i.e.
$P_i = (0, 0)$).

Chosen values        | Unit gadget                                         | Other derived values used in the proof
$\delta = 0.025$     | $\hat{r} = 1$                                       | $(0.1432, 1.0149) \preceq N \preceq (1.44, 1.015)$
$\lambda = 10^{-5}$  | $\hat{R} = (1 + \delta) = 1.025$                    | $(0.9495, 0.386) \preceq M \preceq (0.9496, 0.3861)$
$w_P = 1$            | $\hat{P} = (0, 0)$                                  | $1.003 < \alpha < 1.004$
$w_Q = 10^{-2}$      | $\hat{Q} = (\lambda, 0) = (10^{-5}, 0)$             | $1.0526 < \beta < 1.05261$
$w_A = 4$            | $\hat{A} = (1, -0.5)$                               | $0.99 < \gamma < 0.99047$
$w_B = 4$            | $\hat{B} = (1, 0.5)$                                | $1.0003 < \sigma(A) < 1.0004$
$w_C = 11$           | $(1, 0.70223) \preceq \hat{C} \preceq (1, 0.70224)$ | $1.0001 < \sigma(B) < 1.0002$
$w_D = 31$           | $(1, 1.35739) \preceq \hat{D} \preceq (1, 1.3574)$  | $1 < \sigma(C) < 1.0001$
$w_E = 274$          | $\hat{E} = (0, 1)$                                  | $0.9999 < \sigma(D) < 0.99992$

Table 1: The relation $\preceq$ denotes the less-or-equal component-wise relation.

Morning 

We need to prove that the morning clustering of $G_i$ is stable assuming that $G_{i-1}$ is not
sleeping. Note that this assumption implies that $i > 1$, since the gadget $G_0$ is always sleeping
when $G_1$ is in the morning. Since the singleton cluster $\{A_i\}$ is trivially stable, we just need to
show that $\mathcal{N} = \{B_i, C_i, D_i, E_i, P_i, Q_i\}$ is stable. It suffices to show that $B_i$,
$Q_i$ and $P_i$ are stable w.r.t. $\{A_i\}$ (the other points in $\mathcal{N}$ are further from $A_i$),
and that $P_i$ is stable w.r.t. any cluster of $G_{i-1}$. Letting $N = \mu(\mathcal{N})$, we have
$x_N = (w_B + w_C + w_D + \lambda w_Q)/w_{\mathcal{N}}$ and $y_N = \sqrt{(1 + \delta)^2 - x_N^2}$.

The point $P_i$ is stable w.r.t. $\{A_i\}$, since
$d(P_i, N) = (1 + \delta) < \sqrt{1^2 + (0.5)^2} = d(P_i, A_i)$. To prove the same for $Q_i$, note that
$d(Q_i, A_i) = \sqrt{(1 - \lambda)^2 + (0.5)^2} > R$, while on the other hand $x_N > x_{Q_i}$ implies
$d(Q_i, N) < R$.

As for $B_i$, $d^2(B_i, N) = (x_{B_i} - x_N)^2 + (y_{B_i} - y_N)^2 = \|B_i\|^2 + R^2 - 2(x_N x_{B_i} + y_N y_{B_i})$.
Thus, the inequality $d(B_i, N) < d(B_i, A_i) = 1$ simplifies to $5/4 + R^2 - 2x_N - y_N < 1$, which
can be checked to be valid.

It remains to prove that $P_i$ is stable w.r.t. any cluster of $G_{i-1}$. In any stage of the day of
$G_{i-1}$ (different from the night), the distance from any center of $G_{i-1}$ to $P_i$ is more than
the distance between $C_{i-1}$ and $P_i$. We observe that
$d^2(P_i, C_{i-1}) = (x_{P_i} - x_{S_{i-1}^*})^2 + d^2(S_{i-1}^*, C_{i-1}) = R^2 (1 - \epsilon)^2 + d^2(S_{i-1}^*, C_{i-1})$,
using (2). The assumption $\epsilon < d^2(S^*, C)/(1 + \delta)^2$ directly implies
$d(P_i, C_{i-1}) > (1 + \delta) = d(P_i, N)$.

1st Call 

We start by analyzing part I of this stage. Since we are assuming that $G_{i-1}$ is sleeping, there
must be some cluster $\mathcal{C}$ of $G_{i-1}$ with center in $S_{i-1}^*$ (note that $G_{i-1}$ can be
the leaf gadget $G_0$ as well). By (2) we have $d(P_i, S_{i-1}^*) < R_i$, and so $P_i$ will join
$\mathcal{C}$. We claim that $Q_i$ is instead stable (the stability of the other points of $G_i$ is
implied), i.e. $d(Q_i, N) < d(Q_i, S_{i-1}^*)$. We already know that $d(Q_i, N) < R$, so we show
$d(Q_i, S_{i-1}^*) > R$. Using (2), this amounts to $R(1 - \epsilon) + \lambda \hat{r} > R$, which holds
for $\epsilon < \lambda/(1 + \delta)$.

We now analyze the next iteration, i.e. part II of this stage. We claim that $Q_i$ will join
$\mathcal{C} \cup \{P_i\}$, and $B_i$ will join $\{A_i\}$. To establish the former, we show that
$d(Q_i, \mu(\mathcal{N}')) > R$ where $\mathcal{N}' = \mathcal{N} - \{P_i\}$. Since $P_i$ is in the
origin, we can write $N' = \alpha N$ with $\alpha = w_{\mathcal{N}}/w_{\mathcal{N}'}$. Thus, the
inequality we are interested in is $(\lambda - \alpha x_N)^2 + (\alpha y_N)^2 > R^2$, which holds
whenever $(\alpha^2 - 1) R^2 > 2 \lambda \alpha x_N$. Finally, since $\alpha > 1$, $R > 1$ and $x_N < 1$,
the inequality is implied by $\alpha(1 - 2\lambda) > 1$, which holds for the chosen values.

It remains to prove that $B_i$ is not stable w.r.t. $\{A_i\}$, i.e. $d(B_i, N') > d(B_i, A_i) = 1$.
Again, starting with the inequality $(1 - \alpha x_N)^2 + (1/2 - \alpha y_N)^2 > 1$, we get the
equivalent inequality $1/4 + \alpha^2 R^2 > \alpha(2 x_N + y_N)$, which is easy to verify.

Finally, we prove that $C_i$ is instead stable w.r.t. $\mathcal{N}'$. Similarly, we get
$x_{C_i}^2 + y_{C_i}^2 + \alpha^2 R^2 - 2\alpha(x_N x_{C_i} + y_N y_{C_i}) < (y_{A_i} - y_{C_i})^2$,
which is implied by $3/4 + \alpha^2 R^2 < y_{C_i}(1 + 2\alpha y_N)$.

Afternoon 

The last stage ended with the clusters $\mathcal{N}'' = \{C_i, D_i, E_i\}$ and $\{A_i, B_i\}$ of $G_i$,
since $P_i$ and $Q_i$ both joined the cluster $\mathcal{C}$ of $G_{i-1}$. We claim that, at this point,
$P_i$, $Q_i$ and $C_i$ are not stable and will all join the cluster $\{A_i, B_i\}$.

Let $\mathcal{C}' = \mathcal{C} \cup \{P_i, Q_i\}$; note that the total weight $w_{\mathcal{C}}$ of the
cluster $\mathcal{C}$ is the same whether $G_{i-1}$ is the leaf gadget $G_0$ or not, since by
definition $w_{\mathcal{C}} = w_F = w_A + w_B + w_C + w_D$. We start by showing that
$d(P_i, \mu(\mathcal{C}')) > \hat{r} = 1$, which proves that the claim is true for $P_i$ and $Q_i$. By
defining $d = x_{P_i} - x_{S_{i-1}^*}$, the inequality can be rewritten as
$d - (w_P d + w_Q (d + \lambda))/w_{\mathcal{C}'} > 1$, which by (2) is implied by
$(1 - \epsilon)(1 + \delta) w_{\mathcal{C}}/w_{\mathcal{C}'} > 1 + \lambda w_Q$. It can be checked that
$(1 + \delta) w_{\mathcal{C}}/w_{\mathcal{C}'} > 1 + \lambda w_Q$, and the assumption on $\epsilon$
completes the proof.

Now we prove that $C_i$ is not stable w.r.t. $\{A_i, B_i\}$, by showing that
$d(C_i, N'') > y_{C_i}$ where $N'' = \mu(\mathcal{N}'')$. Note that the inequality is implied by
$x_{C_i} - x_{N''} > y_{C_i}$, which is equivalent to $w_E/w_{\mathcal{N}''} > y_{C_i}$ and holds for
the chosen values.

At this point, analogously to the morning stage, we want to show that this new clustering
is stable, assuming that $G_{i-1}$ is not sleeping. Note that the analysis in the morning stage
directly implies that $P_i$ is stable w.r.t. any cluster of $G_{i-1}$. It can be shown as well that
$P_i$ is stable w.r.t. $\mathcal{N}''' = \{D_i, E_i\}$, and $D_i$ is stable w.r.t.
$\mathcal{M} = \{A_i, B_i, C_i, P_i, Q_i\}$ (the stability of the other points is implied).

2nd Call 

For part I of this stage, we assume that $G_{i-1}$ is sleeping, and so there is some cluster
$\mathcal{C}$ of $G_{i-1}$ with center in $S_{i-1}^*$. Similarly to the 1st call (part I), $P_i$ will join
$\mathcal{C}$. The point $Q_i$ is instead stable, since we proved $d(Q_i, S_{i-1}^*) > R$, while
$x_M > x_{Q_i}$ implies $d(Q_i, M) < R$.

We now analyze the next iteration, i.e. part II of this stage. We claim that $Q_i$ will join
$\mathcal{C} \cup \{P_i\}$, and $D_i$ will join $\mathcal{M}' = \mathcal{M} - \{P_i\}$. This can be proven
analogously to part II of the first call, by using $M' = \mu(\mathcal{M}') = \beta M$, where
$\beta = w_{\mathcal{M}}/w_{\mathcal{M}'}$.

Night 

The last stage leaves us with the clusters $\{A_i, B_i, C_i, D_i\}$ and the singleton $\{E_i\}$. We
want to prove that in one iteration $P_i$ and $Q_i$ will join $\{E_i\}$. In the afternoon stage, we
already proved that $d(P_i, \mu(\mathcal{C}')) > \hat{r}$, and since $d(P_i, E_i) = \hat{r} = 1$, the point
$P_i$ will join $\{E_i\}$. For the point $Q_i$, we have
$d(Q_i, \mu(\mathcal{C}')) = d(P_i, \mu(\mathcal{C}')) + \lambda > \hat{r} + \lambda$, while
$d(Q_i, E_i) = \sqrt{\hat{r}^2 + \lambda^2} < \hat{r} + \lambda$. Thus, the point $Q_i$, as well as $P_i$,
will join $\{E_i\}$.

$G_{i+1}$'s Call

In this stage, we are analyzing the waking-up process from the point of view of the sleeping
gadget. We suppose that $G_i$ ($i > 0$) is sleeping and that $G_{i+1}$ wants to wake it up.

We start by considering part I of this stage, when only $P_{i+1}$ joined the cluster
$\mathcal{S} = \{A_i, B_i, C_i, D_i\}$. Let $\mathcal{S}' = \mathcal{S} \cup \{P_{i+1}\}$. We want to verify
that the points in $\mathcal{S}$ are stable w.r.t. $\{E_i, P_i, Q_i\}$, i.e. that for each
$Z \in \mathcal{S}$, $d(Z, \mu(\mathcal{S}')) < d(Z, \mu\{E_i, P_i, Q_i\})$. This inequality is equivalent
to $d(S^*, \mu(\mathcal{S}')) < \sigma(Z)$, and given the ordering of the stretches, it is enough to
show it for $Z = D$. By (2) we have that
$d(S^*, \mu(\mathcal{S}')) = (1 - \epsilon) R_{i+1} w_P/w_{\mathcal{S}'}$, and using (1) we get
$d(S^*, \mu(\mathcal{S}')) = \hat{r}(1 - \epsilon) \gamma \sigma(A)$, where
$\gamma = (w_P/w_{\mathcal{S}'})(w_{\mathcal{S}'} + w_Q)/(w_P + (1 + \lambda) w_Q)$. Finally, it is easy
to verify that $\gamma \sigma(A) < \sigma(D)$.

In part II of this stage, $Q_{i+1}$ joined $\mathcal{S}'$. Let
$\mathcal{S}'' = \mathcal{S}' \cup \{Q_{i+1}\}$. We want to verify that all the points in $\mathcal{S}$
except $A_i$ will move to the cluster $\{E_i, P_i, Q_i\}$.

We start by showing that $d(A_i, \mu(\mathcal{S}'')) < d(A_i, \mu\{E_i, P_i, Q_i\})$. This inequality is
equivalent to $d(S^*, \mu(\mathcal{S}'')) < \sigma(A)$, and we have
$d(S^*, \mu(\mathcal{S}'')) = (1 - \epsilon) R_{i+1} (w_P + (1 + \lambda) w_Q)/(w_P + w_Q + w_F)$.
Using (1) to substitute $R_{i+1}$, we get $d(S^*, \mu(\mathcal{S}'')) = (1 - \epsilon) \sigma(A)$, which
proves that $A_i$ will not change cluster.

Similarly, we want to prove that, for $Z \in \mathcal{S}$, $Z \ne A$, it holds that
$d(S^*, \mu(\mathcal{S}'')) = (1 - \epsilon) \sigma(A) > \sigma(Z)$. Given the ordering of the stretches,
it suffices to show it for $Z = B$. Recalling that $\epsilon < (\sigma(A) - \sigma(B))/\sigma(A)$, the
proof is concluded.

3.5 Extensions 

The proof in the previous section assumed that the set of initial centers corresponds to the means
of the "morning clusters" for each gadget $G_i$ with $i > 0$. A common initialization for k-means
is to choose the set of centers among the data points. We now briefly explain how to modify
our instance so as to have this property and the same number of iterations.

Consider the unit gadget $\hat{G}$ for simplicity. One of the centers will be the point E. In the
beginning we want all the points of $\hat{G}$ except A to be assigned to E. To obtain this, we
consider two new data points, each with a center on it. Add a point (and center) I with
$x_I = x_A = 1$ and such that $y_A - y_I$ is slightly less than $d(A, E)$. In this way A will be
assigned to this center. Also, we add another point (and center) J very close to I (but further
from A) so that, when B joins the cluster $\{I\}$, moving the center towards itself, the point I will
move to the cluster $\{J\}$. By modifying all the gadgets in the instance in this way, we reach the
morning clustering of each gadget in two steps. Also, it is easy to check that the new points do
not affect the following steps.

Har-Peled and Sadri [9] conjectured that, for any dimension d, the number of iterations of
k-means might be bounded by some polynomial in the number of points n and the spread
$\Delta$ ($\Delta$ is the ratio between the largest and the smallest pairwise distance).

This conjecture was already disproven in [2] for $d = \Omega(\sqrt{n})$. By using the same argument,
we can modify our construction to an instance in $d = 3$ dimensions having linear spread, for
which k-means requires $2^{\Omega(n)}$ iterations. Thus, the conjecture does not hold for any
$d \ge 3$.

4 Conclusions and further discussion 

We presented how to construct a 2-dimensional instance with k clusters for which the k-means
algorithm requires $2^{\Omega(k)}$ iterations. For $k = \Theta(n)$, we obtain the lower bound
$2^{\Omega(n)}$. Our result improves the best known lower bound [2] in terms of number of
iterations (which was $2^{\Omega(\sqrt{n})}$), as well as in terms of dimensionality (it held for
$d = \Omega(\sqrt{n})$).

We observe that in our construction each gadget uses a constant number of points and wakes
up the next gadget twice. For $k = o(n)$, we could use $O(n/k)$ points for each gadget, and it
would be interesting to see if one can construct a gadget with that many points that is able to
wake up the next one $\Omega(n/k)$ times. Note that this would give the lower bound
$(n/k)^{\Omega(k)}$, which for $k = n^c$ ($0 < c < 1$) simplifies to $n^{\Omega(k)}$. This would match the
upper bound $O(n^{kd})$, as long as the construction lies in a constant number of dimensions.

A polynomial upper bound for the case d = 1 has been recently proven in the smoothed 
regime [T3]. It is natural to ask if this result can be extended to the ordinary case. 